A Lexicalized Tree-Adjoining Grammar for Vietnamese
نویسندگان
چکیده
In this paper, we present the first sizable grammar built for Vietnamese using LTAG, developed over the past two years, named vnLTAG. This grammar aims at modelling written language and is general enough to be both applicationand domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite.
منابع مشابه
Automated Extraction of Tree Adjoining Grammars from a Treebank for Vietnamese
In this paper, we present a system that automatically extracts lexicalized tree adjoining grammars (LTAG) from treebanks. We first discuss in detail extraction algorithms and compare them to previous works. We then report the first LTAG extraction result for Vietnamese, using a recently released Vietnamese treebank. The implementation of an open source and language independent system for automa...
متن کاملTree-Adjoining Grammars Are Not Closed Under Strong Lexicalization
A lexicalized tree-adjoining grammar is a tree-adjoining grammar where each elementary tree contains some overt lexical item. Such grammars are being used to give lexical accounts of syntactic phenomena, where an elementary tree defines the domain of locality of the syntactic and semantic dependencies of its lexical items. It has been claimed in the literature that for every tree-adjoining gram...
متن کاملVerification of Lexicalized Tree Adjoining Grammars
One approach to verification and validation of language processing systems includes the verification of system resources. In general, the grammar is a key resource in such systems. In this paper we discuss verification of lexicalized tree adjoining grammars (LTAGs) (Joshi and Schabes, 1997) as one instance of a system resource, and as one phase of a larger verification effort.
متن کاملEncoding Frequency Information in Lexicalized Grammars
We address the issue of how to associate frequency information with lexicalized grammar formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework. We consider systematically a number of alternative probabilistic frameworks, evaluating their adequacy from both a theoretical and empirical perspective using data from existing large treebanks. We also propose three orthogon...
متن کامل